AITopics | kruskal rank

Collaborating Authors

kruskal rank

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Identification of Multivariate Measurement Error Models

Hu, Yingyao

arXiv.org Machine LearningDec-3-2025

Multivariate continuous latent variables arise in numerous empirical settings in economics, psychology, marketing, epidemiology, and the social sciences. Examples include multidimensional skills, cognitive factors, latent preferences, health indices, productivity components, and risk attitudes. In practice, such latent constructs are rarely observed directly; instead, researchers rely on multiple imperfect measurements that contain potentially correlated forms of noise. The resulting measurement-error problem is especially severe when the latent variable is multidimensional: each measurement typically captures only a low-dimensional projection of the latent vector, and the noise may exhibit dependence or heterogeneity across measurement channels. As a result, classical approaches to continuous measurement error, which often rely on injectivity, offer limited guidance. This paper develops a identification strategy tailored specifically to multivariate continuous latent variables measured with noise. The key innovation is to combine tools from multi-linear algebra--specifically the uniqueness properties of so-called CP tensor decompositions-- with the multivariate extension of Kotlarski's identity, a powerful deconvolution result based on characteristic functions. The starting point is the observation that third-order cross-moments of three separate measurements form a three-way moment tensor whose CP decomposition is governed by the latent factor loading matrices. By invoking Kruskal's theorem (Kruskal, 1977a), I show that these loadings are generically identifiable even when each measurement matrix is rank-deficient and--even more surprisingly--when the stack of all measurement matrices is non-injective.

identification, kruskal rank, rank condition, (15 more...)

arXiv.org Machine Learning

2512.0297

Country: North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.66)

Add feedback

Efficient Algorithms for Verifying Kruskal Rank in Sparse Linear Regression and Related Applications

Zhou, Fengqin

arXiv.org Artificial IntelligenceMar-6-2025

We present novel algorithmic techniques to efficiently verify the Kruskal rank of matrices that arise in sparse linear regression, tensor decomposition, and latent variable models. Our unified framework combines randomized hashing techniques with dynamic programming strategies, and is applicable in various settings, including binary fields, general finite fields, and integer matrices. In particular, our algorithms achieve a runtime of $\mathcal{O}\left(dk \cdot \left(nM\right)^{\lceil k / 2 \rceil}\right)$ while ensuring high-probability correctness. Our contributions include: A unified framework for verifying Kruskal rank across different algebraic settings; Rigorous runtime and high-probability guarantees that nearly match known lower bounds; Practical implications for identifiability in tensor decompositions and deep learning, particularly for the estimation of noise transition matrices.

algorithm 1, kruskal rank, matrix, (13 more...)

arXiv.org Artificial Intelligence

2503.04986

Country:

North America > United States > Texas (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)

Add feedback

Memorization Capacity of Multi-Head Attention in Transformers

Mahdavi, Sadegh, Liao, Renjie, Thrampoulidis, Christos

arXiv.org Artificial IntelligenceOct-12-2023

Transformers have become the go-to architecture for language and vision tasks, yet their theoretical properties, especially memorization capacity, remain elusive. This paper investigates the memorization abilities of multi-head attention mechanisms, examining how many example sequences they can memorize, as a function of the number of heads and sequence length. Motivated by experimental findings on vision transformers, we introduce novel assumptions about the linear independence of input data, distinct from the commonly used general-position assumption. Under these assumptions, we demonstrate that an attention layer with $H$ heads, dimension $d$, and context size $n < d$, featuring $\Theta(Hd^2)$ parameters, can memorize $\Omega(Hn)$ examples. Our analysis sheds light on how different attention heads handle various example sequences, aided by the softmax operator's saturation property. We validate our findings through experiments on synthetic data.

assumption, assumption 1, memorization, (14 more...)

arXiv.org Artificial Intelligence

2306.0201

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia (0.04)
North America > Canada > Ontario (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Learning from Multiple Noisy Partial Labelers

Yu, Peilin, Ding, Tiffany, Bach, Stephen H.

arXiv.org Machine LearningJun-8-2021

Programmatic weak supervision creates models without hand-labeled training data by combining the outputs of noisy, user-written rules and other heuristic labelers. Existing frameworks make the restrictive assumption that labelers output a single class label. Enabling users to create partial labelers that output subsets of possible class labels would greatly expand the expressivity of programmatic weak supervision. We introduce this capability by defining a probabilistic generative model that can estimate the underlying accuracies of multiple noisy partial labelers without ground truth labels. We prove that this class of models is generically identifiable up to label swapping under mild conditions. We also show how to scale up learning to 100k examples in one minute, a 300X speed up compared to a naive implementation. We evaluate our framework on three text classification and six object classification tasks. On text tasks, adding partial labels increases average accuracy by 9.6 percentage points. On image tasks, we show that partial labels allow us to approach some zero-shot object classification problems with programmatic weak supervision by using class attributes as partial labelers. Our framework is able to achieve accuracy comparable to recent embedding-based zero-shot learning methods using only pre-trained attribute detectors

learning, partial label, plf, (12 more...)

arXiv.org Machine Learning

2106.0453

Country:

North America > United States (0.28)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Government (0.68)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Anandkumar, Animashree, Hsu, Daniel, Janzamin, Majid, Kakade, Sham

arXiv.org Machine LearningAug-13-2013

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish generic identifiability under a constraint, referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of "higher order" expansion conditions on the topic-word matrix or the population structure of the model. This set of higher-order expansion conditions allow for overcomplete models, and require the existence of a perfect matching from latent topics to higher order observed words. We establish that random structured topic models are identifiable w.h.p. in the overcomplete regime. Our identifiability results allows for general (non-degenerate) distributions for modeling the topic proportions, and thus, we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of Tucker decompositions, but is more general than the Candecomp/Parafac (CP) decomposition.

machine learning, natural language, topic model, (19 more...)

arXiv.org Machine Learning

1308.2853

Country: North America > United States (0.92)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback